Efficient Reinforcement Learning for Robots using Informative Simulated Priors -Additional Material-
Authors
Abstract
In this supplement to the main paper, we derive the derivatives required to implement the informative prior from a simulator in PILCO [1]. First, for completeness, we repeat the derivation of the mean, covariance, and input-output covariance of the predictive distribution of a Gaussian process (GP) when the prior mean is a radial basis function (RBF) network. Then, we detail the partial derivatives of the predictive distribution with respect to the input distribution.
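As a rough illustration of what a GP with an RBF-network prior mean looks like, the sketch below computes the standard GP posterior mean and variance at deterministic test inputs, using a squared-exponential kernel. It is not the moment-matching derivation for uncertain inputs that the supplement covers, and it is not PILCO's code. All names (`se_kernel`, `rbf_prior_mean`, `gp_predict`), the kernel choice, and the hyperparameter values are assumptions made for this example.

```python
import numpy as np

def se_kernel(A, B, lengthscales, signal_var):
    """Squared-exponential kernel between rows of A (n, d) and B (m, d)."""
    diff = (A[:, None, :] - B[None, :, :]) / lengthscales
    return signal_var * np.exp(-0.5 * np.sum(diff ** 2, axis=-1))

def rbf_prior_mean(X, centers, weights, widths):
    """Assumed RBF-network prior mean: m(x) = sum_i w_i exp(-0.5 ||(x - c_i) / widths||^2)."""
    return se_kernel(X, centers, widths, 1.0) @ weights

def gp_predict(X, y, X_star, centers, weights, widths,
               lengthscales, signal_var, noise_var):
    """GP posterior mean and variance at deterministic inputs X_star.

    The GP models the residual y - m(X); the prior mean m is added back
    at the test inputs.
    """
    K = se_kernel(X, X, lengthscales, signal_var) + noise_var * np.eye(len(X))
    K_star = se_kernel(X_star, X, lengthscales, signal_var)
    residual = y - rbf_prior_mean(X, centers, weights, widths)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, residual))
    mean = rbf_prior_mean(X_star, centers, weights, widths) + K_star @ alpha
    v = np.linalg.solve(L, K_star.T)
    var = signal_var - np.sum(v ** 2, axis=0)
    return mean, var

# Hypothetical toy data, for illustration only.
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(20, 2))
y = np.sin(X[:, 0]) + 0.01 * rng.normal(size=20)
centers = rng.uniform(-1.0, 1.0, size=(5, 2))
weights = rng.normal(size=5)
widths = np.array([0.5, 0.5])
hyp = dict(lengthscales=np.array([0.7, 0.7]), signal_var=1.0, noise_var=1e-2)

X_star = np.array([[0.0, 0.0], [50.0, 50.0]])
mean, var = gp_predict(X, y, X_star, centers, weights, widths, **hyp)
```

Far from the training data the kernel terms vanish, so the prediction falls back to the prior mean and the variance returns to the signal variance. This fallback is why an informative prior mean matters in regions the robot has not yet explored.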
Similar resources
Using BELBIC based optimal controller for omni-directional three-wheel robots model identified by LOLIMOT
In this paper, an intelligent controller is applied to control the motion of omni-directional robots. First, the dynamics of the three-wheel robots, a nonlinear plant with considerable uncertainties, are identified using an efficient training algorithm named LoLiMoT. Then, an intelligent controller based on the brain emotional learning algorithm is applied to the identified model. This emotional l...
Learning state representations with robotic priors
Robot learning is critically enabled by the availability of appropriate state representations. We propose a robotics-specific approach to learning such state representations. As robots accomplish tasks by interacting with the physical world, we can facilitate representation learning by considering the structure imposed by physics; this structure is reflected in the changes that occur in the wor...
State Representation Learning in Robotics: Using Prior Knowledge about Physical Interaction
State representations critically affect the effectiveness of learning in robots. In this paper, we propose a robotics-specific approach to learning such state representations. Robots accomplish tasks by interacting with the physical world. Physics in turn imposes structure on both the changes in the world and on the way robots can effect these changes. Using prior knowledge about interacting wit...
Active Reward Learning from Critiques
Learning from demonstration algorithms, such as Inverse Reinforcement Learning, aim to provide a natural mechanism for programming robots, but can often require a prohibitive number of demonstrations to capture important subtleties of a task. Rather than requesting additional demonstrations blindly, active learning methods leverage uncertainty to query the user for action labels at states with ...
Active Learning from Critiques via Bayesian Inverse Reinforcement Learning
Learning from demonstration algorithms, such as Inverse Reinforcement Learning, aim to provide a natural mechanism for programming robots, but can often require a prohibitive number of demonstrations to capture important subtleties of a task. Rather than requesting additional demonstrations blindly, active learning methods leverage uncertainty to query the user for action labels at states with ...